NSF PAR Search | NSF Public Access Repository

Toward Better Efficiency vs. Fidelity Tradeoffs in Web Archives

Zhu, Jingyuan; Sun, Huanchen; Madhyastha, Harsha V (October 2025, ACM SIGCOMM Internet Measurement Conference)

Operators of web archives have two options for how to crawl pages from the web. Browser-based dynamic crawlers capture all of the resources on every page, but incur high compute overheads. Static browserless crawlers are more lightweight, but miss page resources which are fetched only when scripts are executed. In this paper, we make the case that a web archive does not have to make a binary choice between dynamic and static crawling. Instead, by using a browser for a carefully chosen small subset of crawls, an archive can significantly improve its ability to serve statically crawled pages with high fidelity. First, we show how to reuse crawled resources, both across pages and across multiple crawls of the same page over time. Second, by leveraging a dynamic crawl of a page, we show that subsequent static crawls of the page can be augmented to fetch resources without executing the scripts which request them. We estimate that, as long as 8.9% of page crawls use a browser, an archive can serve roughly 99% of the remaining statically crawled pages without any loss in fidelity, up from 55% without our techniques.
Free, publicly-accessible full text available October 28, 2026
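
The second technique in the abstract above (augmenting a static crawl with what an earlier dynamic crawl observed) can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the choice of HTML tags, and the assumption that script-requested URLs stay the same between crawls are all hypothetical simplifications made for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StaticResourceParser(HTMLParser):
    """Collects resource URLs that are visible directly in the HTML markup."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.urls.add(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.urls.add(urljoin(self.base_url, attrs["href"]))


def static_urls(base_url, html):
    """URLs a browserless crawler could discover without running scripts."""
    parser = StaticResourceParser(base_url)
    parser.feed(html)
    return parser.urls


def augmented_fetch_list(base_url, html, prior_dynamic_urls):
    """Hypothetical augmentation: add URLs that only a past browser crawl saw.

    Assumes (for illustration) that script-requested URLs are unchanged
    since the dynamic crawl; a real archive would need to handle churn.
    """
    visible = static_urls(base_url, html)
    script_only = set(prior_dynamic_urls) - visible
    return visible | script_only
```

Given the HTML of a fresh static crawl and the set of URLs logged during an earlier browser-based crawl of the same page, `augmented_fetch_list` returns every URL to fetch, including those that would otherwise be requested only when scripts run.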
Toward Bandwidth-adaptive Fully-Immersive Volumetric Video Conferencing

Ghosh, Rajrup; Shin, Christina; Zhang, Lei; Ye, Muyang; Jin, Tao; Madhyastha, Harsha; Netravali, Ravi; Ortega, Antonio; Rao, Sanjay; Rowe, Anthony; et al (December 2025, Proceedings of ACM CoNext 2025)

Free, publicly-accessible full text available December 2, 2026
ZENITH: Towards A Formally Verified Highly-Available Control Plane

Namyar, Pooria; Ghavidel, Arvin; Zhang, Mingyang; Madhyastha, Harsha V; Ravi, Srivatsan; Wang, Chao; Govindan, Ramesh (September 2025, ACM Digital Library)

Free, publicly-accessible full text available September 8, 2026
LiVo: Toward Bandwidth-adaptive Fully-Immersive Volumetric Video Conferencing

Ghosh, Rajrup; Shin, Christina; Zhang, Lei; Ye, Muyang; Jin, Tao; Madhyastha, Harsha; Netravali, Ravi; Ortega, Antonio; Rao, Sanjay; Rowe, Anthony; et al (October 2025, ACM CoNEXT)

Free, publicly-accessible full text available October 3, 2026
Cosmic: Cost-Effective Support for Cloud-Assisted 3D Printing

Yao, Yuan; He, Chuan; Okwudire, Chinedum; Madhyastha, Harsha V (July 2025, USENIX)

Free, publicly-accessible full text available July 7, 2026
Sprinter: Speeding Up High-Fidelity Crawling of the Modern Web

Goel, Ayush; Zhu, Jingyuan; Netravali, Ravi; Madhyastha, Harsha (April 2024, 21st USENIX Symposium on Networked Systems Design and Implementation)

Full Text Available
Auxo: Efficient Federated Learning via Scalable Client Clustering

https://doi.org/10.1145/3620678.3624651

Liu, Jiachen; Lai, Fan; Dai, Yinwei; Akella, Aditya; Madhyastha, Harsha V; Chowdhury, Mosharaf (October 2023, ACM)

Full Text Available
ModelKeeper: Accelerating DNN Training via Automated Training Warmup

Lai, Fan; Dai, Yinwei; Madhyastha, Harsha V.; Chowdhury, Mosharaf (January 2023, USENIX NSDI)

Full Text Available
Jawa: Web Archival in the Era of JavaScript

Goel, Ayush; Netravali, Ravi; Madhyastha, Harsha (July 2022, 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22))

By repeatedly crawling and saving web pages over time, web archives (such as the Internet Archive) enable users to visit historical versions of any page. In this paper, we point out that existing web archives are not well designed to cope with the widespread presence of JavaScript on the web. Some archives store petabytes of JavaScript code, and yet many pages render incorrectly when users load them. Other archives which store the end-state of page loads (e.g., screen captures) break post-load interactions implemented in JavaScript. To address these problems, we present Jawa, a new design for web archives which significantly reduces the storage necessary to save modern web pages while also improving the fidelity with which archived pages are served. Key to enabling Jawa’s use at scale are our observations on a) the forms of non-determinism which impair the execution of JavaScript on archived pages, and b) the ways in which JavaScript’s execution fundamentally differs between live web pages and their archived copies. On a corpus of 1 million archived pages, Jawa reduces overall storage needs by 41%, when compared to the techniques currently used by the Internet Archive.
Full Text Available
FedScale: Benchmarking model and system performance of federated learning at scale

Lai, Fan; Dai, Yinwei; Singapuram, Sanjay S.; Liu, Jiachen; Zhu, Xiangfeng; Madhyastha, Harsha V.; Chowdhury, Mosharaf (July 2022, ICML)

Full Text Available
